Abstract


A regression procedure is developed to link simultaneously a very large number of item response 
theory (IRT) parameter estimates obtained from a large number of test forms, where each form 
has been separately calibrated and where forms can be linked on a pairwise basis by means of 
common items. An application is made to forms in which a two-parameter logistic model is 
applied to dichotomous items and a general partial credit model is applied to polytomous items. 

Linking test forms by use of item response theory (IRT) is a familiar activity when one
reference form is equated to one base form (Hambleton, Swaminathan, & Rogers, 1991, ch. 9). In 
practice, testing programs often link one test form to another in circumstances in which multiple 
test forms are involved. In typical cases, new forms are equated based on one or more old 
forms, and these old forms have in turn been linked to earlier forms. If modifications in linking 
procedures are required for any reason, then their implementation can be an arduous task. In this 
report, an approach based on linear models is considered that permits simultaneous linking of a 
large number of forms without the need to produce a long sequence in which one form is linked to 
one or more previously used forms. The suggested approach is a generalization of the log-mean 
mean procedure (Mislevy & Bock, 1990) briefly described in Kolen and Brennan (2004). 

Section 1 describes the model used in the analysis of IRT true-score equating. Section 2 
describes the required regression analysis. Section 3 provides some concluding remarks. 

1 The Model 

To describe the general problem under study, consider the following situation. A given test
has administrations 1 to $T$, where $T > 1$. In these $T$ administrations, a finite and nonempty set
$J$ of $v$ items is used, but not all examinees receive all items. Associated with Item $j$ in $J$ are
response scores 0 to $r_j - 1$, where $r_j \geq 2$ is an integer. Associated with an examinee is a real
proficiency variable $\theta$. If an examinee has proficiency $\theta$, then the probability $P_j(k|\theta) > 0$ that the
examinee receives score $k$, $0 < k \leq r_j - 1$, on Item $j$ satisfies

\[ \log[P_j(k|\theta)/P_j(k-1|\theta)] = D a_j(\theta - b_j + d_{jk}), \]

where $a_j$ is real and normally positive, $b_j$ and $d_{jk}$ are real, and $\sum_{k=1}^{r_j-1} d_{jk} = 0$ (Muraki, 1997).
Thus one has a generalized partial credit model that reduces to a two-parameter logistic model if
Item $j$ has $r_j = 2$ categories. The item discrimination $a_j$, the item difficulty $b_j$, and the location
parameters $d_{jk}$ are unknown, except that $d_{jk}$ obviously is 0 in the dichotomous case of $r_j = 2$.
The multiplier $D$ is a known constant. It may be 1, 1.7, or 1.702. The values 1.7 and 1.702 are
employed so that parameters from IRT models based on the logistic distribution function will be
similar in value to IRT parameters based on the normal distribution function (Lord & Novick,
1968, p. 400). At Administration $t$, $1 \leq t \leq T$, Examinee $i$, $1 \leq i \leq N_t$, is administered a nonempty
subset $J_{it}$ of the items in $J$. The response score for Item $j$ is then $X_{ijt}$. It is assumed that,
conditional on the proficiency $\theta_{it}$ of Examinee $i$, the $X_{ijt}$, $j$ in $J_{it}$, are independent. In addition, at
Administration $t$, the examinee population is assumed to have a normal proficiency distribution
$N(B_t, A_t)$ with mean $B_t$ and standard deviation $A_t$. To identify parameters, let $A_1 = 1$ and
$B_1 = 0$, so that the proficiency distribution for Administration 1 is a standard normal distribution.
This convention is reasonable if the form used in Administration 1 is regarded as the base form.
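
As a minimal sketch of the response model, the following Python function evaluates the category probabilities of a single item from the log-odds relation above. The function name `gpcm_probs` and its argument layout are illustrative assumptions and are not part of the report; the default value 1.7 for the multiplier is one of the three values of $D$ mentioned in the text.

```python
import math


def gpcm_probs(theta, a, b, d, D=1.7):
    """Return [P_j(0|theta), ..., P_j(r_j - 1|theta)] for one item.

    d lists d_j1, ..., d_j(r_j - 1) and should sum to 0. For a
    dichotomous (two-parameter logistic) item, pass d=[0.0].
    """
    # Cumulative log-odds relative to score 0: entry k is the sum of
    # D * a * (theta - b + d_jh) over h = 1, ..., k.
    logits = [0.0]
    for d_k in d:
        logits.append(logits[-1] + D * a * (theta - b + d_k))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

Adjacent entries of the returned list reproduce the log-odds on the right-hand side of the model equation, and the entries sum to 1.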

Marginal maximum-likelihood estimation may be employed to determine the parameters $A_t$,
$B_t$, $a_j$, $b_j$, and $d_{jk}$; however, this approach is challenging with conventional software if $J$ includes
thousands of items. A less computationally demanding approach involves separate calibrations for
each form. For each Administration $t$, let the set $J_t$ include each Item $j$ such that the number $N_{jt}$
of examinees $i$ with $j$ in $J_{it}$ exceeds some minimum threshold $m_j > 0$. To ensure that parameters
can be estimated, assume that the number $v_t$ of elements of $J_t$ is at least 3, and assume that each
Item $j$ in the set $J$ is in $J_t$ for some Administration $t$. A scaled version of the parameters $a_j$, $b_j$,
and $d_{jk}$ for each Item $j$ in the set $J_t$ of items associated with Administration $t$ may be obtained
with conventional software such as Parscale for estimation by maximum marginal likelihood. In
conventional application of such software, the marginal distribution of the proficiency variable
is a standard normal distribution. Such a marginal distribution can be achieved by use of a linear
transformation. The proficiency $\theta_{it}$ of Examinee $i$ in Administration $t$ can be converted to the
scaled proficiency $\theta'_{it} = A'_t\theta_{it} + B'_t = (\theta_{it} - B_t)/A_t$, where $A'_t = 1/A_t$ and $B'_t = -B_t/A_t$, so that $\theta'_{it}$
has a standard normal distribution. One may then apply a conventional analysis to the data from
Administration $t$. In this analysis, the conditional probability $P'_{jt}(k|\theta')$ that $X_{ijt} = k$ given $\theta'_{it} = \theta'$
is $P_j(k|A_t\theta' + B_t)$, so that

\[ \log[P'_{jt}(k|\theta')/P'_{jt}(k-1|\theta')] = D a_j(A_t\theta' + B_t - b_j + d_{jk}) = D a'_{jt}(\theta' - b'_{jt} + d'_{jkt}), \]

where

\[ a'_{jt} = A_t a_j = (A'_t)^{-1} a_j, \]

\[ b'_{jt} = (b_j - B_t)/A_t = A'_t b_j + B'_t, \]

and

\[ d'_{jkt} = d_{jk}/A_t = A'_t d_{jk}. \]
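
The rescaling can be checked numerically. The sketch below, with hypothetical helper names `to_scaled` and `to_scaled_theta` and made-up parameter values, confirms that the product $a'_{jt}(\theta' - b'_{jt} + d'_{jkt})$ on the scaled metric equals $a_j(\theta - b_j + d_{jk})$ on the original metric, so the response probabilities are unchanged.

```python
def to_scaled(a, b, d, A_t, B_t):
    """Scaled parameters (a', b', d') of one item at an administration whose
    proficiency distribution has mean B_t and standard deviation A_t."""
    return A_t * a, (b - B_t) / A_t, [d_k / A_t for d_k in d]


def to_scaled_theta(theta, A_t, B_t):
    """Scaled proficiency theta' = (theta - B_t) / A_t."""
    return (theta - B_t) / A_t


# Made-up values, used only to check the identity numerically.
a, b, d, A_t, B_t, theta = 1.2, 0.3, [0.5, -0.5], 2.0, 1.0, 0.7
a_s, b_s, d_s = to_scaled(a, b, d, A_t, B_t)
theta_s = to_scaled_theta(theta, A_t, B_t)
original = [a * (theta - b + d_k) for d_k in d]
rescaled = [a_s * (theta_s - b_s + d_k) for d_k in d_s]
```

The lists `original` and `rescaled` agree up to rounding error.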

Conventional analysis with programs such as Parscale provides maximum marginal-likelihood
estimates $\hat{a}'_{jt}$, $\hat{b}'_{jt}$, and $\hat{d}'_{jkt}$ for $a'_{jt}$, $b'_{jt}$, and $d'_{jkt}$, respectively, for each Item $j$ in $J_t$ for each
Administration $t \geq 1$. In typical linking attempts, for each Administration $t > 1$ and each Item $j$
in $J_t$, a chained approach is used to estimate the parameters $A_t$, $B_t$, $A'_t$, $B'_t$, $a_j$, $b_j$, and $d_{jk}$. This
approach requires availability of sequences of common items. Let $K_t$ be the set of Items $j$ common
to both Administration $t$ and to some Administration $s < t$, so that $j$ is in both $J_t$ and $J_s$. The
chained approach can only be used to link all Administrations $t$ for $1 \leq t \leq T$ if $K_t$ is nonempty
for $2 \leq t \leq T$. In other words, to each Administration $t > 1$ must correspond an Administration $s < t$
such that these two administrations share common items.
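
The condition on the sets $K_t$ is easy to check once the item sets $J_t$ are known. The following sketch assumes that the forms are stored as a dictionary from administration number to item set; the function names and the second example layout are hypothetical.

```python
def common_with_earlier(forms):
    """Return {t: K_t} for every administration after the first.

    forms maps the administration number t to its item set J_t.
    """
    seen = set()
    K = {}
    for index, t in enumerate(sorted(forms)):
        if index > 0:
            K[t] = forms[t] & seen  # items shared with some earlier form
        seen |= forms[t]
    return K


def chain_is_possible(forms):
    """True if every K_t is nonempty, so that chained linking can proceed."""
    return all(K_t for K_t in common_with_earlier(forms).values())


chained = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}
# Administration 2 shares nothing with Administration 1, although
# Administration 3 has items in common with both.
broken = {1: {1, 2, 3}, 2: {4, 5, 6}, 3: {3, 4, 7}}
```

For `chained` every $K_t$ is nonempty, whereas for `broken` the set $K_2$ is empty and the chain cannot be started at Administration 2.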

2 Regression Analysis 

Even if, for some Administration $t > 1$, the set $K_t$ of common items is empty, it still may be
possible to estimate all required parameters by use of a three-stage regression analysis. In the
first stage, the item discrimination $a_j$ is estimated for each Item $j$ in $J$. In the second stage, the
item difficulty $b_j$ is estimated for each Item $j$. In the third stage, the location parameter $d_{jk}$ is
estimated for each score $k$, $0 < k \leq r_j - 1$, for each Item $j$.

In the first stage, assume, as is normally the case, that each estimated item discrimination
$\hat{a}'_{jt}$, $j$ in $J_t$, is positive. The equations

\[ \log a'_{jt} = \log A_t + \log a_j \]

lead to the regression model in which $\log \hat{A}_t$ and $\log \hat{a}_j$ are selected to minimize

\[ \sum_{t=1}^{T} \sum_{j \in J_t} [\log \hat{a}'_{jt} - \log \hat{A}_t - \log \hat{a}_j]^2 \]

subject to the constraint that $\hat{A}_1 = 1$. The minimization problem is commonly encountered
in the analysis of variance when an incomplete two-way layout is considered and an additive
model is employed in which the variables represented by rows and columns are treated as nominal
variables. The $T$ administrations correspond to rows and the $v$ items in $J$ correspond to columns.
The layout is incomplete because not all combinations of administrations and items are observed.
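
The additive fit for an incomplete two-way layout can be sketched in a few lines. The routine below is an illustration only: it alternates between the normal equations for the items and those for the administrations until the estimates stop changing, which is a simple iterative scheme and not a closed-form solution. The function name, the dictionary layout, and the stopping rule are assumptions.

```python
def fit_additive(y, base=1, sweeps=2000, tol=1e-12):
    """Least squares for y[(j, t)] ~ alpha[t] + beta[j] with alpha[base] = 0.

    In the first stage y[(j, t)] is the log of the estimated discrimination
    of item j at administration t, alpha[t] estimates log A_t, and beta[j]
    estimates log a_j. Only observed (j, t) cells appear in y.
    """
    by_item, by_admin = {}, {}
    for (j, t), value in y.items():
        by_item.setdefault(j, []).append((t, value))
        by_admin.setdefault(t, []).append((j, value))
    alpha = {t: 0.0 for t in by_admin}
    beta = {j: 0.0 for j in by_item}
    for _ in range(sweeps):
        change = 0.0
        # Item effects given the current administration effects.
        for j, cells in by_item.items():
            new = sum(v - alpha[t] for t, v in cells) / len(cells)
            change = max(change, abs(new - beta[j]))
            beta[j] = new
        # Administration effects given the item effects; the base stays at 0.
        for t, cells in by_admin.items():
            if t == base:
                continue
            new = sum(v - beta[j] for j, v in cells) / len(cells)
            change = max(change, abs(new - alpha[t]))
            alpha[t] = new
        if change < tol:
            break
    return alpha, beta
```

Exponentiating `alpha[t]` and `beta[j]` gives the estimates of $A_t$ and $a_j$. The iteration can be slow when forms are only weakly linked, which is one reason to prefer a direct solution of the normal equations when many forms are involved.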

Efficient computation of least-squares estimates requires some care due to the very large
number of items to be considered. For each Item $j$, let $U_j$ be the set of Administrations $t$ such
that $j$ is in $J_t$, and let $u_j$ be the number of elements of $U_j$. For Administrations $t$ and $t'$, let $H_{tt'}$
be the set of Items $j$ such that $j$ is in both $J_t$ and $J_{t'}$ ($j$ is a common item for Administrations $t$
and $t'$), and let $G_t$ be the set of positive integers $t' \leq T$ such that $H_{tt'}$ is not empty, so that
Administrations $t$ and $t'$ share common items. Solution of the normal equations shows that

\[ \log \hat{a}_j = u_j^{-1} \sum_{t \in U_j} [\log \hat{a}'_{jt} - \log \hat{A}_t] \tag{1} \]

and

\[ \log \hat{A}_t - v_t^{-1} \sum_{t' \in G_t} \log \hat{A}_{t'} \sum_{j \in H_{tt'}} u_j^{-1} = v_t^{-1} \sum_{j \in J_t} \Bigl[\log \hat{a}'_{jt} - u_j^{-1} \sum_{t' \in U_j} \log \hat{a}'_{jt'}\Bigr] \tag{2} \]

(Scheffé, 1959, p. 114). Clearly $\hat{A}'_t = 1/\hat{A}_t$.

In the second stage, $b_j$ and $B_t$ are estimated by use of the equation

\[ b'_{jt} A_t = -B_t + b_j. \]


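The estimates $\hat{B}_t$ and $\hat{b}_j$ are selected to minimize

\[ \sum_{t=1}^{T} \sum_{j \in J_t} [\hat{b}'_{jt}\hat{A}_t + \hat{B}_t - \hat{b}_j]^2 \]

subject to the constraint $\hat{B}_1 = 0$. Again the regression analysis corresponds to an additive model
for an incomplete two-way layout. Thus

\[ \hat{b}_j = u_j^{-1} \sum_{t \in U_j} [\hat{b}'_{jt}\hat{A}_t + \hat{B}_t] \tag{3} \]

and

\[ \hat{B}_t - v_t^{-1} \sum_{t' \in G_t} \hat{B}_{t'} \sum_{j \in H_{tt'}} u_j^{-1} = -v_t^{-1} \sum_{j \in J_t} \Bigl[\hat{b}'_{jt}\hat{A}_t - u_j^{-1} \sum_{t' \in U_j} \hat{b}'_{jt'}\hat{A}_{t'}\Bigr]. \tag{4} \]

It then follows that $\hat{B}'_t = -\hat{B}_t/\hat{A}_t$.

In the third stage, the $\hat{d}_{jk}$ are selected to minimize

\[ \sum_{t=1}^{T} \sum_{j \in J_t} [\hat{d}'_{jkt}\hat{A}_t - \hat{d}_{jk}]^2 = \sum_{j \in J} \sum_{t \in U_j} [\hat{d}'_{jkt}\hat{A}_t - \hat{d}_{jk}]^2. \]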


In this case, one can proceed as in the analysis of variance for a one-way layout with independent 
variable corresponding to items and with replications corresponding to administrations in which 
the item appears. Thus 

\[ \hat{d}_{jk} = u_j^{-1} \sum_{t \in U_j} \hat{d}'_{jkt}\hat{A}_t. \tag{5} \]
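
Once the administration estimates $\hat{A}_t$ and $\hat{B}_t$ are in hand, the item estimates are simple averages over the administrations in which each item appears. The sketch below applies the averaging formulas numbered (1), (3), and (5); the function name and the dictionary layout keyed by (item, administration) pairs are illustrative assumptions.

```python
import math


def item_estimates(a_scaled, b_scaled, d_scaled, A, B):
    """Average over administrations as in equations (1), (3), and (5).

    a_scaled[(j, t)] and b_scaled[(j, t)] are floats, d_scaled[(j, t)] is
    the list of scaled location estimates, and A[t], B[t] are the estimated
    standard deviation and mean for administration t.
    """
    log_a, b, d, count = {}, {}, {}, {}
    for (j, t), a_jt in a_scaled.items():
        count[j] = count.get(j, 0) + 1
        log_a[j] = log_a.get(j, 0.0) + math.log(a_jt) - math.log(A[t])
        b[j] = b.get(j, 0.0) + b_scaled[(j, t)] * A[t] + B[t]
        terms = [d_k * A[t] for d_k in d_scaled[(j, t)]]
        d[j] = [x + y for x, y in zip(d.get(j, [0.0] * len(terms)), terms)]
    a_hat = {j: math.exp(log_a[j] / count[j]) for j in count}
    b_hat = {j: b[j] / count[j] for j in count}
    d_hat = {j: [x / count[j] for x in d[j]] for j in count}
    return a_hat, b_hat, d_hat
```

The function returns the discrimination itself rather than its logarithm, so the first stage average is exponentiated before it is returned.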

If $T = 2$, then the regressions reduce to log-mean mean equating (Mislevy & Bock, 1990).
The set $K_2 = H_{12}$ consists of the common items $j$ in both $J_1$ and $J_2$. Then $\hat{A}_2$ is the geometric
mean of the ratios $\hat{a}'_{j2}/\hat{a}'_{j1}$ for $j$ in $K_2$. In addition, $\hat{B}_2$ is the arithmetic mean of $\hat{b}'_{j1} - \hat{b}'_{j2}\hat{A}_2$ for $j$
in $K_2$. If $j$ is in $J_1$ but not in $J_2$, so that Item $j$ is only encountered at Administration 1, then
$\hat{a}_j = \hat{a}'_{j1}$, $\hat{b}_j = \hat{b}'_{j1}$, and $\hat{d}_{jk} = \hat{d}'_{jk1}$. If $j$ is in $J_2$ but not in $J_1$, so that Item $j$ is only encountered at
Administration 2, then

\[ \log \hat{a}_j = \log \hat{a}'_{j2} - \log \hat{A}_2, \]

\[ \hat{b}_j = \hat{b}'_{j2}\hat{A}_2 + \hat{B}_2, \]

and

\[ \hat{d}_{jk} = \hat{d}'_{jk2}\hat{A}_2. \]

If $j$ is in $K_2$, so that Item $j$ is a common item, then

\[ \log \hat{a}_j = \frac{\log \hat{a}'_{j1} + \log \hat{a}'_{j2} - \log \hat{A}_2}{2}, \]

\[ \hat{b}_j = \frac{\hat{b}'_{j1} + \hat{b}'_{j2}\hat{A}_2 + \hat{B}_2}{2}, \]

and

\[ \hat{d}_{jk} = \frac{\hat{d}'_{jk1} + \hat{d}'_{jk2}\hat{A}_2}{2} \]

(Kolen & Brennan, 2004, p. 162).
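
For two forms the linking constants have closed forms, which the following sketch computes directly from the common items. The function name `link_two_forms` and the use of dictionaries keyed by item are assumptions made for illustration.

```python
import math


def link_two_forms(a1, b1, a2, b2):
    """Return (A_2, B_2) for two separately calibrated forms.

    a1, b1 hold the scaled discrimination and difficulty estimates from
    Administration 1 (the base form), keyed by item; a2, b2 hold those from
    Administration 2. Items present in both dictionaries are the common items.
    """
    common = sorted(set(a1) & set(a2))
    if not common:
        raise ValueError("the two forms share no common items")
    # Geometric mean of the discrimination ratios a'_j2 / a'_j1.
    log_A2 = sum(math.log(a2[j] / a1[j]) for j in common) / len(common)
    A2 = math.exp(log_A2)
    # Arithmetic mean of b'_j1 - b'_j2 * A_2.
    B2 = sum(b1[j] - b2[j] * A2 for j in common) / len(common)
    return A2, B2
```

With error-free scaled parameters generated from known values of $A_2$ and $B_2$, the function returns those values up to rounding error.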

The regression approach successfully defines estimates if, and only if, the set $G$ of pairs
$\{(j, t) : j \in J_t, 1 \leq t \leq T\}$ satisfies the inseparability requirement (Goodman, 1968) used in the case
of two-way contingency tables with omitted cells. The inseparability requirement is essentially
the requirement that all items can be linked together by use of common items for a sequence of
administrations. Thus to each $j$ and $j'$ in $J$ must correspond a positive integer $w$ such that, for
Administrations $t(z)$, $1 \leq z \leq w$, $j$ is in $J_{t(1)}$, $j'$ is in $J_{t(w)}$, and $H_{t(z)t(z+1)}$ is nonempty if $z$ is a
positive integer less than $w$. For a simple example, let $T = 3$, and let $J$ be the set of integers
from 1 to 5. Let $J_1 = \{1, 2, 3\}$, let $J_2 = \{2, 3, 4\}$, and let $J_3 = \{3, 4, 5\}$. Then $G$ is inseparable.
For example, consider $j = 1$ and $j' = 5$. Then for $w = 2$, $t(1) = 1$, and $t(2) = 3$, $j$ is in $J_{t(1)}$, $j'$
is in $J_{t(2)}$, and $H_{t(1)t(2)} = \{3\}$ is not empty. On the other hand, let $T = 3$, let $J$ be the set of
integers from 1 to 7, let $J_1 = \{1, 2, 3\}$, let $J_2 = \{2, 3, 4\}$, and let $J_3 = \{5, 6, 7\}$. The inseparability
condition fails, for $H_{13}$ and $H_{23}$ are empty. Thus $j$ in $J_1$ or $J_2$ cannot be linked to $j'$ in $J_3$.
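
Because every item belongs to at least one form, the requirement can be checked by asking whether all administrations can be reached from one of them through pairs of forms that share items. The sketch below does this with a simple search; the function name and the dictionary layout are assumptions.

```python
def is_inseparable(forms):
    """True if all administrations are joined by chains of common items.

    forms maps each administration t to its nonempty item set J_t. Two
    administrations are adjacent when their item sets intersect.
    """
    admins = list(forms)
    if not admins:
        return False
    reached = {admins[0]}
    frontier = [admins[0]]
    while frontier:
        t = frontier.pop()
        for s in admins:
            if s not in reached and forms[t] & forms[s]:
                reached.add(s)
                frontier.append(s)
    return len(reached) == len(admins)


linked = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}
unlinked = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {5, 6, 7}}
```

The two layouts are the examples given in the text: `is_inseparable(linked)` is true and `is_inseparable(unlinked)` is false.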

In practice, the problem of minimization of the sums of squares requires some care when the
number of items in $J$ is very large. It is desirable that a computer routine apply (2) and (4) to
solve $T - 1$ simultaneous equations rather than use a completely general approach with $T + v - 1$
simultaneous equations to solve, $T - 1$ equations for the $T - 1$ administrations other than the
initial administration and $v$ equations for the $v$ items in $J$. In SAS, the GLM procedure may be
applied with the ABSORB option for the variable that specifies administrations as long as the
data are sorted by order of administration. Elementary procedures are then needed to apply (1),
(3), and (5), for GLM does not obtain $\log \hat{a}_j$, $\hat{b}_j$, and $\hat{d}_{jk}$ if the ABSORB option is used.
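
The reduction to $T - 1$ equations can be sketched without statistical software. The routine below is an illustration under assumed names: it multiplies each equation of the form (2) by $v_t$, fixes the effect of the base administration at 0, and solves the resulting small system by Gaussian elimination with partial pivoting.

```python
def gauss_solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("singular system: forms are not all linked")
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def solve_reduced(y, base=1):
    """Administration effects alpha[t] for y[(j, t)] ~ alpha[t] + beta[j],
    with alpha[base] = 0, from the T - 1 reduced normal equations."""
    U = {}  # item -> administrations in which it appears
    for (j, t) in y:
        U.setdefault(j, []).append(t)
    free = sorted({t for _, t in y if t != base})
    pos = {t: i for i, t in enumerate(free)}
    M = [[0.0] * len(free) for _ in free]
    rhs = [0.0] * len(free)
    for (j, t), value in y.items():
        if t == base:
            continue
        i = pos[t]
        M[i][i] += 1.0  # accumulates v_t on the diagonal
        rhs[i] += value - sum(y[(j, s)] for s in U[j]) / len(U[j])
        for s in U[j]:
            if s != base:
                M[i][pos[s]] -= 1.0 / len(U[j])
    alpha = {base: 0.0}
    alpha.update(zip(free, gauss_solve(M, rhs)))
    return alpha
```

With `y[(j, t)]` set to the log of the estimated discrimination, the result estimates $\log A_t$. With `y[(j, t)]` set to $-\hat{b}'_{jt}\hat{A}_t$, the same routine returns estimates of $B_t$, because the second-stage criterion has the same additive form. A singular system signals that the forms are not all linked.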

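A change in the definition of the base form has no real impact on results. Let the base form
be Form $s$ for some positive integer $s \leq T$. The proficiency parameter $\theta_{it}$ is converted to the
proficiency parameter $\theta^*_{it} = (\theta_{it} - B_s)/A_s$, the item difficulty $b_j$ is converted to $b^*_j = (b_j - B_s)/A_s$, the
item discrimination $a_j$ is converted to $a^*_j = A_s a_j$, and the location $d_{jk}$ becomes $d^*_{jk} = d_{jk}/A_s$.
Thus

\[ a^*_j(\theta^*_{it} - b^*_j + d^*_{jk}) = a_j(\theta_{it} - b_j + d_{jk}). \]

The mean of $\theta^*_{it}$ is $B^*_t = (B_t - B_s)/A_s$, and the standard deviation of $\theta^*_{it}$ is $A^*_t = A_t/A_s$. Consider
minimization of

\[ \sum_{t=1}^{T} \sum_{j \in J_t} [\log \hat{a}'_{jt} - \log \hat{A}^*_t - \log \hat{a}^*_j]^2 \]

subject to the constraint that $\hat{A}^*_s = 1$, minimization of

\[ \sum_{t=1}^{T} \sum_{j \in J_t} [\hat{b}'_{jt}\hat{A}^*_t + \hat{B}^*_t - \hat{b}^*_j]^2 \]

subject to the constraint $\hat{B}^*_s = 0$, and minimization of

\[ \sum_{t=1}^{T} \sum_{j \in J_t} [\hat{d}'_{jkt}\hat{A}^*_t - \hat{d}^*_{jk}]^2. \]

Then $\hat{A}^*_t = \hat{A}_t/\hat{A}_s$, $\hat{a}^*_j = \hat{A}_s\hat{a}_j$, $\hat{B}^*_t = (\hat{B}_t - \hat{B}_s)/\hat{A}_s$, $\hat{b}^*_j = (\hat{b}_j - \hat{B}_s)/\hat{A}_s$, and $\hat{d}^*_{jk} = \hat{d}_{jk}/\hat{A}_s$.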
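
Since the change of base form is a fixed linear transformation of the estimates, it can be applied after the fact without repeating the regressions. The following sketch, with a hypothetical function name and dictionary layout, converts a set of estimates to a new base form $s$.

```python
def rebase(A, B, a, b, d, s):
    """Re-express linked estimates with administration s as the base form.

    A[t], B[t] are keyed by administration; a[j], b[j], and the list d[j]
    are keyed by item. All inputs are on the scale of the original base.
    """
    A_s, B_s = A[s], B[s]
    A_new = {t: A[t] / A_s for t in A}
    B_new = {t: (B[t] - B_s) / A_s for t in B}
    a_new = {j: A_s * a[j] for j in a}
    b_new = {j: (b[j] - B_s) / A_s for j in b}
    d_new = {j: [d_k / A_s for d_k in d[j]] for j in d}
    return A_new, B_new, a_new, b_new, d_new
```

After the conversion, administration `s` has standard deviation 1 and mean 0, and the implied scaled parameters $A_t a_j$ and $(b_j - B_t)/A_t$ for each form are unchanged.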

3 Conclusion 

The approach proposed in this report is readily applied even when the linkages between 
forms are quite complex. Common software programs such as Parscale and SAS may be used 
in calculations. If outliers are a concern, then standard methods of residual analysis may be 
employed to identify unusually large residuals from each required regression analysis (Draper &
Smith, 1998). If needed, it is possible to remove selected items from the equating computations.
Because the estimated difficulty $\hat{b}'_{jt}$ is relatively unstable if the estimated discrimination $\hat{a}'_{jt}$ is
unusually small, it may also be desirable to remove items from equating if $\hat{a}'_{jt}$ is unusually small,
say less than 0.2.
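
A screen of this kind is straightforward to apply before the regressions are run. The sketch below uses the value 0.2 suggested above as a default; the function name and the data layout are assumptions.

```python
def drop_low_discrimination(a_scaled, b_scaled, threshold=0.2):
    """Remove (item, administration) pairs with small estimated discrimination.

    a_scaled and b_scaled are dictionaries keyed by (item, administration).
    Pairs whose discrimination estimate is below the threshold are dropped
    from both dictionaries.
    """
    keep = {key for key, a_jt in a_scaled.items() if a_jt >= threshold}
    a_kept = {key: a_scaled[key] for key in keep}
    b_kept = {key: b_scaled[key] for key in keep}
    return a_kept, b_kept
```

Removing items can break the links between forms, so the inseparability requirement should be checked again on the reduced set of pairs.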

The proposed approach does satisfy the basic requirement that all parameters are estimated 
with increasing accuracy if sample sizes become large at each administration and if all model 
assumptions are satisfied. The approach does not exploit information concerning accuracy of 
parameter estimates or correlations of parameter estimates, so it is not fully efficient in a statistical 
sense. More efficient approaches require more computational resources than are typically available 
in operational programs. 

Once the estimates $\hat{A}_t$, $\hat{B}_t$, $\hat{a}_j$, $\hat{b}_j$, and $\hat{d}_{jk}$ are available, a variety of methods can be employed
to equate scores from different administrations (Kolen & Brennan, 2004, ch. 6). In the case of
IRT true-score equating, let scores be provided for Administration $t$ based on the subset $F_t$ of
items, where $F_t$ is included in $J_t$. Thus Examinee $i$ has a raw score

\[ S_{it} = \sum_{j \in F_t} X_{ijt} \]

with range from 0 to $s_t = \sum_{j \in F_t} (r_j - 1)$. The test characteristic curve for Administration $t$ is
estimated to be

\[ \hat{T}_t(\theta) = \sum_{j \in F_t} \sum_{k=1}^{r_j} (k - 1)\hat{P}_j(k - 1|\theta), \]

where

\[ \log[\hat{P}_j(k|\theta)/\hat{P}_j(k-1|\theta)] = D\hat{a}_j(\theta - \hat{b}_j + \hat{d}_{jk}). \]

The raw score of 0 for Administration $t$ is converted to a raw score of 0 for Administration 1, while
the raw score of $s_t$ for Administration $t$ is converted to a raw score of $s_1$ for Administration 1. For
$0 < s < s_t$, a raw score of $s$ for Administration $t$ is converted to a raw score of $\hat{T}_1(\hat{T}_t^{-1}(s))$ for
Administration 1.
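
The conversion can be sketched as follows, assuming that all discrimination estimates are positive, so that each test characteristic curve is increasing and can be inverted by bisection. The function names, the representation of an item as a tuple `(a, b, d)` on the common scale, and the search interval from -10 to 10 are assumptions made for illustration.

```python
import math


def gpcm_probs(theta, a, b, d, D=1.7):
    """Category probabilities for one item; d=[0.0] for a dichotomous item."""
    logits = [0.0]
    for d_k in d:
        logits.append(logits[-1] + D * a * (theta - b + d_k))
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def tcc(theta, items, D=1.7):
    """Expected raw score at theta; items is a list of (a, b, d) tuples."""
    return sum(k * p
               for a, b, d in items
               for k, p in enumerate(gpcm_probs(theta, a, b, d, D)))


def inverse_tcc(score, items, D=1.7, lo=-10.0, hi=10.0):
    """Bisection for the theta at which the curve equals score."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if tcc(mid, items, D) < score:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equate_to_base(score, items_t, items_1, D=1.7):
    """Convert a raw score on form t to the raw-score scale of form 1."""
    top_t = sum(len(d) for _, _, d in items_t)  # maximum score s_t
    top_1 = sum(len(d) for _, _, d in items_1)  # maximum score s_1
    if score <= 0:
        return 0.0
    if score >= top_t:
        return float(top_1)
    return tcc(inverse_tcc(score, items_t, D), items_1, D)
```

When the two forms consist of the same items, a score is converted to itself. Scores so close to 0 or to the maximum that the curve does not attain them within the search interval are mapped to the value of the curve at the end of the interval.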

The approach applied to the two-parameter logistic model and general partial credit model in 
this report applies with little change if other models are used. For example, it is a simple matter 
to modify the analysis for use with a three-parameter logistic model or for a two-parameter normal
ogive model. Note that in the case of a three-parameter logistic model, an added stage would
be employed in which the guessing parameter for an Item $j$ would be averaged over the reported
values for all forms in which it appears. 
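
The added stage for the guessing parameter amounts to a simple average per item. A minimal sketch, with an assumed function name and data layout, is as follows.

```python
def average_guessing(c_scaled):
    """Average the reported guessing estimates of each item over forms.

    c_scaled[(j, t)] is the guessing estimate reported for item j in the
    calibration of administration t.
    """
    totals = {}
    for (j, _t), c in c_scaled.items():
        running_sum, n = totals.get(j, (0.0, 0))
        totals[j] = (running_sum + c, n + 1)
    return {j: running_sum / n for j, (running_sum, n) in totals.items()}
```

No rescaling is applied before averaging in this sketch, on the assumption that a lower asymptote is unaffected by a linear change of the proficiency scale.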